
10 research outputs found

Domain-specific language models and lexicons for tagging

Author: Ando Rie K.
Chute Christopher G.
Coden Anni R.
Duffy Patrick H.
Pakhomov Serguei V.
Publication venue: Elsevier Inc.

Abstract: Accurate and reliable part-of-speech tagging is useful for many Natural Language Processing (NLP) tasks that form the foundation of NLP-based approaches to information retrieval and data mining. In general, large annotated corpora are necessary to achieve desired part-of-speech tagger accuracy. We show that a large annotated general-English corpus is not sufficient for building a part-of-speech tagger model adequate for tagging documents from the medical domain. However, adding a quite small domain-specific corpus to a large general-English one boosts performance to over 92% accuracy from 87% in our studies. We also suggest a number of characteristics to quantify the similarities between a training corpus and the test data. These results give guidance for creating an appropriate corpus for building a part-of-speech tagger model that gives satisfactory accuracy results on a new domain at a relatively small cost.

Elsevier - Publisher Connector

Anni R. Coden, Ph.D.

Author: Coden Anni R., Ph.D.
Publication venue: Scholarly Commons
Publication date: 01/01/2017

Anni R. Coden, Ph.D., is currently the project manager and technical lead of the IBM Systems G Anomaly Detection Solution. Previously she led a project at IBM’s T.J. Watson Research Center on Modeling and Simulation in a Smarter Cities environment, with a focus on Emergency Response Management. Anni also managed the Medical Text and Image Analysis group. The team had a long-term collaboration with the Mayo Clinic, worked with the Memorial Sloan-Kettering Cancer Center, and was also involved with academic research. Anni Coden joined IBM in 1981. Previously, she was a Researcher at the Massachusetts Institute of Technology, where she received her Ph.D. and M.S. in Computer Science. She received her M.S. in electrical engineering and her B.S. in mathematics from the Vienna University of Technology (Austria). Anni Coden has published in many areas such as theoretical computer science, computer vision, and computational linguistics and is the holder of multiple patents.

Embry-Riddle Aeronautical University

Morning Session 1- Keynote Speaker: Anni R. Coden

Author: Coden Anni R., Ph.D.
Publication venue: Scholarly Commons
Publication date: 15/05/2017

Embry-Riddle Aeronautical University

Automatic Search from Streaming Data

Author: Anni R. Coden
Eric W. Brown

LIMITED DISTRIBUTION NOTICE: This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). Copies may be requested from IBM T. J. Watson Research Center, P

CiteSeerX

Domain-specific language models and lexicons for tagging

Author: Anni R. Coden
Christopher G. Chute
Patrick H. Duffy
Rie K. Ando
Serguei V. Pakhomov
Publication date: 01/01/2005

been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). Copies may be requested from IBM T. J. Watson Research Center, P

CiteSeerX

ACM SIGIR 2001 workshop "Information Retrieval Techniques for Speech Applications"

Author: Allan J.
Anni R. Coden
Eric Brown
Huang J.
Ries K.
Savitha Srinivasan
Srinivasan S.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Word sense disambiguation across two domains: Biomedical literature and clinical notes

Author: Anni R. Coden
Aronson
Carletta
Chklovski
Christopher G. Chute
Clopper
Coden
Cohen
Cohen
Friedman
Guergana K. Savova
Humphrey
Humphreys
Igor L. Sominsky
Krallinger
Leroy
Leroy
Liu
Pakhomov
Philip V. Ogren
Piet C. de Groen
Poesio
Rie Johnson
Schuemie
Schuler
Schutze
Sehgal
Shatkay
Weeber
Witten
Xu
Publication venue: 'Elsevier BV'

Crossref